Improve response parsing: no ptypes, faster datetimes #513

Open
karawoo wants to merge 10 commits into main from kara-parsing

Conversation

Collaborator

@karawoo karawoo commented Mar 5, 2026

Intent

Fixes #483

Approach

Removes the connectapi_ptypes dictionary and vctrs-based type coercion system (ensure_columns, ensure_column, vec_cast). Instead, each getter function declares its own datetime_cols and applies lightweight post-parse coercion via coerce_datetime(), which handles RFC 3339 strings, epoch seconds, POSIXct pass-through, and all-NA columns.
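
As a sketch of the coercion described above (hypothetical; the branch's actual coerce_datetime() may differ in name, signature, and edge-case handling):

```r
# Hypothetical sketch of per-column datetime coercion; not the branch's
# actual implementation.
coerce_datetime_sketch <- function(x) {
  if (inherits(x, "POSIXct")) {
    return(x) # already parsed: pass through unchanged
  }
  if (all(is.na(x))) {
    # An all-NA column carries no type information; return NA datetimes
    return(.POSIXct(rep(NA_real_, length(x)), tz = "UTC"))
  }
  if (is.numeric(x)) {
    # Epoch seconds, e.g. from older endpoints
    return(as.POSIXct(x, origin = "1970-01-01", tz = "UTC"))
  }
  # RFC 3339 strings such as "2026-03-05T20:13:00Z"
  as.POSIXct(x, format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC")
}
```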

Parsing pipeline:

  • Connect$request() now uses jsonlite::fromJSON() instead of httr::content(as = "parsed"), giving us control over jsonlite's simplification behavior
  • parse_connect_rfc3339() uses a vectorized substr-based parser (faster than strptime on large vectors)
  • page_cursor() fetches subsequent pages with simplify=TRUE so jsonlite builds data frames in C, then combines with vctrs::vec_rbind()
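
To illustrate the substr-based idea, here is a minimal sketch (not the branch's parse_connect_rfc3339()): it assumes fixed-width UTC timestamps like "2026-03-05T20:13:00Z" and ignores offsets and fractional seconds, computing the epoch arithmetically instead of calling strptime per element.

```r
# Hypothetical vectorized RFC 3339 parser sketch; the real
# parse_connect_rfc3339() handles more input shapes.
parse_rfc3339_utc <- function(x) {
  y  <- as.integer(substr(x, 1, 4))
  m  <- as.integer(substr(x, 6, 7))
  d  <- as.integer(substr(x, 9, 10))
  hh <- as.integer(substr(x, 12, 13))
  mm <- as.integer(substr(x, 15, 16))
  ss <- as.integer(substr(x, 18, 19))
  # Days since 1970-01-01 via the "days from civil" algorithm
  y2  <- y - (m <= 2)
  era <- y2 %/% 400
  yoe <- y2 - era * 400
  doy <- (153L * (m + ifelse(m > 2, -3L, 9L)) + 2L) %/% 5L + d - 1L
  doe <- yoe * 365L + yoe %/% 4L - yoe %/% 100L + doy
  days <- era * 146097L + doe - 719468L
  .POSIXct(days * 86400 + hh * 3600 + mm * 60 + ss, tz = "UTC")
}
```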

I started this work while trying to improve performance of get_usage_static() (see profiling on #501 (comment)). With this branch, get_usage_static() is ~20% faster than on main for 175k records.

Checklist

  • Does this change update NEWS.md (referencing the connected issue if necessary)?
  • Does this change need documentation? Have you run devtools::document()?

@karawoo karawoo requested a review from jonkeane March 5, 2026 20:13
@karawoo karawoo mentioned this pull request Mar 5, 2026
Comment on lines +860 to +862
# Keep only the columns relevant to job termination; the API response
# includes extra fields (e.g. payload, guid) on error that vary by outcome.
keep <- c("app_id", "app_guid", "job_key", "job_id", "result", "code", "error")
Collaborator

Is it a problem that we get variable data at this point?

Collaborator Author

I believe that the variable data is for error cases (i.e. trying to terminate a job that is not currently active, which returns a 409 but doesn't raise an R error) vs. successful requests. I guess it's debatable whether we should be raising an R error more eagerly, but I think the behavior of accommodating different field names is consistent with main.

The one difference here is that main will always return the columns of keep, whereas on this branch, if any columns from keep are missing across all responses, they'd be omitted. I'll update to make it more consistent with main.

Collaborator

I was actually thinking a bit in the opposite direction: if we sometimes get different field names, that's probably totally OK. This is probably more important (or really, more possible) when we get to having these objects not be DFs that get passed around; then the normalization of "here are the columns you're getting" can be done at the as.data.frame() point. We don't need to do this here, it just smelled a little funny to me.

Collaborator Author

Gotcha yeah, if we were returning a list then I agree it'd make more sense to not worry about the columns -- the responses are what they are, and that may include different fields.

Comment on lines +796 to +799
# The older /applications/ endpoint returns timestamps as Unix epoch integers
# and ID fields as integers. Normalize to match the v1 endpoint's types.
# For the v1 endpoint these are already character/POSIXct, so the coercions
# are no-ops.
Collaborator

It's probably not worth too much digging, but if you know off the top of your head: how old are older versions here?

Collaborator Author

GET /v1/content/{guid}/jobs was added in October 2022 (https://docs.posit.co/connect/news/#rstudio-connect-2022.10.0); the old applications endpoint was removed in July 2025.


- parse_connectapi_typed(bundles, connectapi_ptypes$bundles)
+ out <- parse_connectapi_typed(bundles, datetime_cols = "created_time")
+ coerce_fs_bytes(out, "size")
Collaborator

We can keep this here, but I do wonder if providing these as fs::byte types is really useful?

Comment on lines +68 to 74
# No shiny apps are deployed in integration tests, so this may be empty.
if (nrow(shiny_usage_local) > 0) {
  expect_gt(length(colnames(shiny_usage_local)), 1)
expect_true("content_guid" %in% names(shiny_usage_local))
expect_s3_class(shiny_usage_local$started, "POSIXct")
}
})
Collaborator

I assume the if here is because a 0-row df doesn't get coerced to this type? But is that true / is that what we want?

Collaborator Author

If there are no shiny apps, the usage data returns an empty response with no rows or columns, so we can't assert the column names/types.
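
The underlying behavior can be reproduced directly with jsonlite (a minimal illustration, not the package's actual call path):

```r
library(jsonlite)

# With records present, jsonlite can infer column names and types and
# simplify the result into a data frame:
str(fromJSON('[{"content_guid": "abc", "started": "2026-03-05T20:13:00Z"}]'))

# With an empty response there is nothing to infer column names or types
# from, so the simplified result has no rows or columns:
fromJSON('[]')
```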

Collaborator

Ah, hmm, that seems not ideal, but if it's not a regression in this work, that's fine. But maybe a follow-up would be good?

Collaborator Author

It is a regression in this work. Previously we would have been able to construct a 0-row data frame with the expected columns based on the ptypes. Since we want to remove the ptypes, we now have no way to know what columns to create. The regression here is inherent to the motivation of the PR. I'm not sure what follow-up we could do that wouldn't bring back ptypes in some form.

Collaborator

Aaah ok, I get it now: the response from the server is a different shape if there are no visits to shiny apps to report, yeah?

I did a bit of poking and it looks like that is also common across other endpoints too (e.g. content search returns an empty list when there are no matches). So that will have a similar issue too.

Comment on lines +412 to +415
- id = c("8966707", "8966708", "8967206", "8967210", "8966214"),
+ id = c(8966707L, 8966708L, 8967206L, 8967210L, 8966214L),
Collaborator

Are these supposed to remain characters? I believe they should (given how we do our IDs), but maybe I'm missing something?

Collaborator Author

They changed because we now return whatever the API returns rather than trying to coerce to specific types, and the test fixtures have integers rather than characters. I'll update those, but the behavior here is correct; it only seems confusing because these are unit tests that can get out of sync with Connect's actual response types.

Collaborator

@jonkeane jonkeane left a comment

Sorry, I commented outside of a review so that I didn't accidentally leave comments without sending them. Overall this looks good, and I'm pro getting this onto main and using it locally to confirm it works well.

Do you think it would be a good thing to add a test or two that simulates the changes we had to make in #512?


Development

Successfully merging this pull request may close these issues.

Remove or drastically overhaul type parsing even when producing data frames
